feat: Add Sherpa ONNX backend for ASR and TTS#8523
Conversation
seems I've completely missed this, sorry @richiejp!

No problem, I put it on hold while I was blocked on testing other stuff, but I can reboot it now. The main issue with this backend is testing: it has a huge feature/API surface.
👍 I see, yep. It would make sense then to try pointing Claude at https://github.com/mudler/LocalAI/tree/master/tests/e2e-backends, as we already have a "small" e2e suite that exercises backends directly via gRPC. This basically skips all the API e2e tests and jumps straight to the backend. It's usually very good at doing test scaffolding. Worth a shot.
Review comment on `func TestSherpaBackendStruct`:

```go
import (
	"os/exec"
	"path/filepath"
	"strings"
	"testing"

	pb "github.com/mudler/LocalAI/pkg/grpc/proto"
)

func TestSherpaBackendStruct(t *testing.T) {
```

> would be nice to use ginkgo here for consistency

Another excerpt from the diff:

```go
package main

/*
#cgo LDFLAGS: -lsherpa-onnx-c-api -lonnxruntime -lstdc++
```
Adds a new Go backend wrapping sherpa-onnx via purego (no cgo). Same approach as opus/stablediffusion-ggml/whisper: a thin C shim (csrc/shim.c + shim.h → libsherpa-shim.so) wraps the bits purego can't reach directly: nested struct config writes, result-struct field reads, and the streaming TTS callback trampoline. The Go side uses opaque uintptr handles and purego.NewCallback for the TTS callback.

Supports:

- VAD via sherpa-onnx's Silero VAD
- Offline ASR: Whisper, Paraformer, SenseVoice, Omnilingual CTC
- Online/streaming ASR: zipformer transducer with endpoint detection (AudioTranscriptionStream emits delta events during decode)
- Offline TTS: VITS (LJS, etc.)
- Streaming TTS: sherpa-onnx's callback API → PCM chunks on a channel, prefixed by a streaming WAV header

Gallery entries: omnilingual-0.3b-ctc-q8-sherpa (1600-language offline ASR), streaming-zipformer-en-sherpa (low-latency streaming ASR), silero-vad-sherpa, vits-ljs-sherpa.

E2E coverage: tests/e2e-backends for offline + streaming ASR, tests/e2e for the full realtime pipeline (VAD + STT + TTS).

Assisted-by: claude-opus-4-7-1M [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>
I think it should be using e2e-backends now for gRPC-level tests. There are also e2e tests based on the realtime API and lower-level tests in the backend source, so we have a three-tier approach. Hopefully this is ready to go now. Next I'd want to use the real streaming for ASR and TTS in the realtime API. Also, there are still features in Sherpa that haven't been exposed. I have to say, though, that I am not a huge fan of ONNX; it seems harder to package than GGML-based backends if you want all of the GPUs to work, and I haven't even tried here, just CUDA and CPU.
The Sherpa backend can handle almost everything related to voice. So far we have VAD, ASR, and TTS. It should be relatively simple to add wake words, diarization, etc. However, I've reached a point where there is so much stuff to test that I'm just going to add one model we don't already have (Omnilingual ASR) and focus on getting the backend initially merged, then expand on testing.
Sherpa supports a lot of models we already have Python backends for, but at a fraction of the size because it is all based on ONNX. We also have ONNX backends already, but it's not clear that we have GPU acceleration for all of those.